Skip to content

Conversation

yaugenst-flex
Copy link
Collaborator

@yaugenst-flex yaugenst-flex commented Oct 8, 2025

Summary

  • serialize the adjoint traced-key paths into the forward simulation’s HDF5 metadata (__tidy3d_traced_field_keys__) so cache hashes differ when the tracked set changes
  • move the Tracer, FieldMap, and TracerKeys helpers into tidy3d/components/autograd and reuse them from both the serializer and the web pipeline
  • add a regression test confirming simulations with identical statics but different tracer sets now hash differently

Alternatives Considered

  • keep relying on the existing sidecar TracerKeys file: this leaves the cache blind to tracer changes, because the hash sees identical HDF5 exports
  • expose traced keys as a new model field: that would surface runtime autograd state in the public schema even though users neither set nor persist it
  • store the paths in model.attrs: those attrs are user-controlled metadata and cached JSON already assumes they are non-functional

Rationale

Writing the traced-key summary to an HDF5 attribute keeps the metadata with the exported simulation, feeds into the hashing logic, and avoids making autograd bookkeeping part of the user-facing schema.

Greptile Overview

Updated On: 2025-10-08 08:26:25 UTC

Summary

This PR fixes a critical autograd cache bug where simulations with identical static parameters but different traced field sets would incorrectly share cache entries. The core issue was that the cache hash only considered the HDF5 file contents (which are identical for the same static simulation) but ignored the sidecar `TracerKeys` file that tracks which fields are being traced by autograd.

The solution serializes traced field keys directly into the HDF5 metadata using a new attribute __tidy3d_traced_field_keys__. This ensures that simulations tracking different autograd parameters generate different cache hashes, preventing incorrect cache hits that could lead to wrong gradient computations in optimization workflows.

The implementation includes a significant refactoring that moves the Tracer, FieldMap, and TracerKeys helper classes from tidy3d/web/api/autograd/utils.py into a new shared module tidy3d/components/autograd/field_map.py. This centralization enables code reuse between the web pipeline and the serialization components while maintaining consistency in traced field metadata handling.

A new regression test test_sim_hash_changes_with_traced_keys() validates that when the traced field set changes (by converting a structure to static), both the field map and the simulation hash change accordingly, ensuring proper cache invalidation.

Important Files Changed

Changed Files
Filename Score Overview
tests/test_components/autograd/test_autograd.py 5/5 Adds regression test validating hash changes when traced keys differ
tidy3d/components/base.py 4/5 Implements core fix by adding traced field keys to HDF5 metadata
tidy3d/web/api/autograd/utils.py 5/5 Refactors to import helper classes from centralized location
tidy3d/components/autograd/field_map.py 4/5 New module containing extracted autograd field mapping utilities
tidy3d/web/api/autograd/io_utils.py 5/5 Updates imports to use centralized field mapping classes

Confidence score: 4/5

  • This PR addresses a critical correctness issue in autograd caching with a well-reasoned solution
  • Score reflects solid implementation with proper testing, though the core change touches critical caching logic
  • Pay close attention to tidy3d/components/base.py where the HDF5 metadata serialization is implemented

Sequence Diagram

sequenceDiagram
    participant User
    participant Simulation
    participant BaseModel as "Tidy3dBaseModel"
    participant FieldMap as "FieldMap/TracerKeys"
    participant HDF5 as "HDF5 File"
    participant Cache as "Cache System"

    User->>Simulation: "Create simulation with traced fields"
    Simulation->>BaseModel: "_strip_traced_fields()"
    BaseModel-->>Simulation: "AutogradFieldMap"
    
    User->>Simulation: "Export to HDF5"
    Simulation->>BaseModel: "to_hdf5()"
    BaseModel->>BaseModel: "_serialized_traced_field_keys()"
    BaseModel->>FieldMap: "TracerKeys.from_field_mapping()"
    FieldMap-->>BaseModel: "TracerKeys instance"
    BaseModel->>FieldMap: "tracer_keys.json()"
    FieldMap-->>BaseModel: "JSON serialized keys"
    BaseModel->>HDF5: "Store traced keys in attributes"
    BaseModel->>HDF5: "Write simulation data"
    
    User->>Cache: "Check cache hash"
    Cache->>BaseModel: "_hash_self()"
    BaseModel->>BaseModel: "to_hdf5() (in memory)"
    Note over BaseModel,HDF5: "HDF5 now includes traced keys in metadata"
    BaseModel-->>Cache: "Hash includes traced keys"
    
    Note over User,Cache: "Different traced sets now produce different hashes"
Loading

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@yaugenst-flex yaugenst-flex force-pushed the FXC-3376-hash-must-include-traced-keys-remote-autograd-cache-bug branch from 161273f to a2fa0ec Compare October 8, 2025 08:34
Copy link
Contributor

github-actions bot commented Oct 8, 2025

Diff Coverage

Diff: origin/develop...HEAD, staged and unstaged changes

  • tidy3d/components/autograd/field_map.py (97.0%): Missing lines 62
  • tidy3d/components/base.py (100%)
  • tidy3d/web/api/autograd/autograd.py (100%)
  • tidy3d/web/api/autograd/io_utils.py (100%)

Summary

  • Total: 73 lines
  • Missing: 1 line
  • Coverage: 98%

tidy3d/components/autograd/field_map.py

  58     )
  59 
  60     def encoded_keys(self) -> list[str]:
  61         """Return the JSON-encoded representation of keys."""
! 62         return [_encoded_path(path) for path in self.keys]
  63 
  64     @classmethod
  65     def from_field_mapping(
  66         cls,

@yaugenst-flex yaugenst-flex force-pushed the FXC-3376-hash-must-include-traced-keys-remote-autograd-cache-bug branch from a2fa0ec to 7236cf8 Compare October 8, 2025 09:20
@yaugenst-flex yaugenst-flex force-pushed the FXC-3376-hash-must-include-traced-keys-remote-autograd-cache-bug branch from 7236cf8 to d4da92d Compare October 16, 2025 18:40
@yaugenst-flex yaugenst-flex added this pull request to the merge queue Oct 17, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 17, 2025
@yaugenst-flex yaugenst-flex force-pushed the FXC-3376-hash-must-include-traced-keys-remote-autograd-cache-bug branch from d4da92d to 99cdef1 Compare October 17, 2025 10:49
@yaugenst-flex yaugenst-flex force-pushed the FXC-3376-hash-must-include-traced-keys-remote-autograd-cache-bug branch from 99cdef1 to bf21c06 Compare October 17, 2025 11:00
@yaugenst-flex yaugenst-flex added this pull request to the merge queue Oct 17, 2025
Merged via the queue into develop with commit efa6d66 Oct 17, 2025
25 checks passed
@yaugenst-flex yaugenst-flex deleted the FXC-3376-hash-must-include-traced-keys-remote-autograd-cache-bug branch October 17, 2025 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants